Overview

Dataset statistics

Number of variables: 25
Number of observations: 3953
Missing cells: 1047
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 772.2 KiB
Average record size in memory: 200.0 B

Variable types

Categorical: 15
Numeric: 10

Warnings

Name has a high cardinality: 3682 distinct values High cardinality
Email ID has a high cardinality: 3373 distinct values High cardinality
Dt_Applied has a high cardinality: 3953 distinct values High cardinality
University has a high cardinality: 3140 distinct values High cardinality
Zip Code has a high cardinality: 615 distinct values High cardinality
Loan Amnt is highly correlated with Funded amnt inv and 2 other fields High correlation
Funded amnt inv is highly correlated with Loan Amnt and 2 other fields High correlation
INSTALLMENT is highly correlated with Loan Amnt and 2 other fields High correlation
Total Paymnt is highly correlated with Loan Amnt and 2 other fields High correlation
INSTALLMENT is highly correlated with Total Paymnt and 2 other fields High correlation
Sub Grade is highly correlated with TERM and 2 other fields High correlation
Total Paymnt is highly correlated with INSTALLMENT and 3 other fields High correlation
Funded amnt inv is highly correlated with INSTALLMENT and 4 other fields High correlation
Loan Amnt is highly correlated with INSTALLMENT and 4 other fields High correlation
Verification Status is highly correlated with Funded amnt inv and 1 other field High correlation
TERM is highly correlated with Sub Grade and 5 other fields High correlation
GRADE is highly correlated with Sub Grade and 2 other fields High correlation
Int Rate is highly correlated with Sub Grade and 2 other fields High correlation
TERM is highly correlated with GRADE and 1 other field High correlation
GRADE is highly correlated with TERM and 1 other field High correlation
Sub Grade is highly correlated with TERM and 1 other field High correlation
Name has 271 (6.9%) missing values Missing
Email ID has 580 (14.7%) missing values Missing
Gender has 78 (2.0%) missing values Missing
University has 118 (3.0%) missing values Missing
Name is uniformly distributed Uniform
Email ID is uniformly distributed Uniform
Dt_Applied is uniformly distributed Uniform
University is uniformly distributed Uniform
Dt_Applied has unique values Unique
Delinq 2Yrs has 3628 (91.8%) zeros Zeros
Inq Last 6Mths has 1822 (46.1%) zeros Zeros
Revol Bal has 42 (1.1%) zeros Zeros
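
The missing-value warnings above can be cross-checked against the dataset statistics. The following is a minimal sketch that uses only figures quoted in this report. It assumes the four columns flagged as Missing are the only ones with missing cells, which is consistent with their counts adding up to the reported total.

```python
# Figures copied from the Overview and Warnings sections of this report.
n_variables = 25
n_observations = 3953
missing_by_column = {"Name": 271, "Email ID": 580, "Gender": 78, "University": 118}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

# Per-column share of missing values, rounded to one decimal as in the report.
column_pct = {
    col: round(100 * n / n_observations, 1) for col, n in missing_by_column.items()
}

print(missing_cells, f"{missing_pct:.1f}%", column_pct)
```

The per-column percentages use the number of observations as the denominator, while the overall percentage uses the total number of cells.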

Reproduction

Analysis started: 2021-06-30 14:59:56.903909
Analysis finished: 2021-06-30 15:00:15.090285
Duration: 18.19 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

Name
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct3682
Distinct (%)100.0%
Missing271
Missing (%)6.9%
Memory size31.0 KiB
Stormy Gerauld
 
1
Perkin Gomersall
 
1
Aviva Cody
 
1
Rusty Netley
 
1
Egbert Huegett
 
1
Other values (3677)
3677 

Length

Max length23
Median length14
Mean length14.03286257
Min length7

Characters and Unicode

Total characters51669
Distinct characters58
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
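
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The three names come from the Sample rows of this variable. This is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# A few names taken from the Sample rows in this report.
names = ["Calley Giron", "Linus Stud", "Anna-diane Larrat"]

# unicodedata.category returns a two-letter general category code, e.g.
# "Lu" (Uppercase Letter), "Ll" (Lowercase Letter), "Zs" (Space Separator),
# "Pd" (Dash Punctuation).
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

print(category_counts.most_common())
```

The two-letter codes correspond to the category names used in the tables of this report, such as Lowercase Letter and Space Separator.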

Unique

Unique3682 ?
Unique (%)100.0%

Sample

1st rowCalley Giron
2nd rowLinus Stud
3rd rowLorelle Ambage
4th rowAnna-diane Larrat
5th rowGill Ruske

Common Values

ValueCountFrequency (%)
Stormy Gerauld1
 
< 0.1%
Perkin Gomersall1
 
< 0.1%
Aviva Cody1
 
< 0.1%
Rusty Netley1
 
< 0.1%
Egbert Huegett1
 
< 0.1%
Elvis Farden1
 
< 0.1%
Ilka Exer1
 
< 0.1%
Starlin Aidler1
 
< 0.1%
Clerissa Branchett1
 
< 0.1%
Dicky McGunley1
 
< 0.1%
Other values (3672)3672
92.9%
(Missing)271
 
6.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
de20
 
0.3%
le6
 
0.1%
dee5
 
0.1%
van5
 
0.1%
kerry4
 
0.1%
salomo4
 
0.1%
derril4
 
0.1%
gill4
 
0.1%
glad4
 
0.1%
imogen4
 
0.1%
Other values (6460)7355
99.2%

Most occurring characters

Value              Count  Frequency (%)
e                  5280   10.2%
a                  4505   8.7%
(space)            3733   7.2%
n                  3525   6.8%
i                  3500   6.8%
r                  3444   6.7%
l                  3183   6.2%
o                  2704   5.2%
t                  2023   3.9%
s                  1723   3.3%
Other values (48)  18049  34.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter40336
78.1%
Uppercase Letter7550
 
14.6%
Space Separator3733
 
7.2%
Other Punctuation35
 
0.1%
Dash Punctuation14
 
< 0.1%
Close Punctuation1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C678
 
9.0%
M600
 
7.9%
B596
 
7.9%
S573
 
7.6%
D479
 
6.3%
A451
 
6.0%
G443
 
5.9%
R400
 
5.3%
L395
 
5.2%
H326
 
4.3%
Other values (16)2609
34.6%
Lowercase Letter
ValueCountFrequency (%)
e5280
13.1%
a4505
11.2%
n3525
 
8.7%
i3500
 
8.7%
r3444
 
8.5%
l3183
 
7.9%
o2704
 
6.7%
t2023
 
5.0%
s1723
 
4.3%
d1395
 
3.5%
Other values (16)9054
22.4%
Other Punctuation
ValueCountFrequency (%)
'32
91.4%
.2
 
5.7%
;1
 
2.9%
Space Separator
ValueCountFrequency (%)
3733
100.0%
Dash Punctuation
ValueCountFrequency (%)
-14
100.0%
Close Punctuation
ValueCountFrequency (%)
]1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin47886
92.7%
Common3783
 
7.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e5280
 
11.0%
a4505
 
9.4%
n3525
 
7.4%
i3500
 
7.3%
r3444
 
7.2%
l3183
 
6.6%
o2704
 
5.6%
t2023
 
4.2%
s1723
 
3.6%
d1395
 
2.9%
Other values (42)16604
34.7%
Common
ValueCountFrequency (%)
3733
98.7%
'32
 
0.8%
-14
 
0.4%
.2
 
0.1%
]1
 
< 0.1%
;1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII51669
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e5280
 
10.2%
a4505
 
8.7%
3733
 
7.2%
n3525
 
6.8%
i3500
 
6.8%
r3444
 
6.7%
l3183
 
6.2%
o2704
 
5.2%
t2023
 
3.9%
s1723
 
3.3%
Other values (48)18049
34.9%

Email ID
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct3373
Distinct (%)100.0%
Missing580
Missing (%)14.7%
Memory size31.0 KiB
lstud1@washington.edu
 
1
tveregan4t@tamu.edu
 
1
tdilnotfp@aol.com
 
1
ndoughty3w@google.com.br
 
1
ebrownfieldn0@php.net
 
1
Other values (3368)
3368 

Length

Max length35
Median length22
Mean length21.83308627
Min length11

Characters and Unicode

Total characters73643
Distinct characters39
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique3373 ?
Unique (%)100.0%

Sample

1st rowcgiron0@ehow.com
2nd rowlstud1@washington.edu
3rd rowlambage2@wix.com
4th rowalarrat3@economist.com
5th rowemacfaul5@theatlantic.com

Common Values

ValueCountFrequency (%)
lstud1@washington.edu1
 
< 0.1%
tveregan4t@tamu.edu1
 
< 0.1%
tdilnotfp@aol.com1
 
< 0.1%
ndoughty3w@google.com.br1
 
< 0.1%
ebrownfieldn0@php.net1
 
< 0.1%
jfleetwood1m@google.com1
 
< 0.1%
knormington1@amazon.co.uk1
 
< 0.1%
cmillmoebf@arizona.edu1
 
< 0.1%
lmccahey5s@addthis.com1
 
< 0.1%
lclauspo@networksolutions.com1
 
< 0.1%
Other values (3363)3363
85.1%
(Missing)580
 
14.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
falywen26@columbia.edu1
 
< 0.1%
rlagem3@fda.gov1
 
< 0.1%
tdilnotfp@aol.com1
 
< 0.1%
ndoughty3w@google.com.br1
 
< 0.1%
ebrownfieldn0@php.net1
 
< 0.1%
jfleetwood1m@google.com1
 
< 0.1%
knormington1@amazon.co.uk1
 
< 0.1%
cmillmoebf@arizona.edu1
 
< 0.1%
lmccahey5s@addthis.com1
 
< 0.1%
lclauspo@networksolutions.com1
 
< 0.1%
Other values (3363)3363
99.7%

Most occurring characters

ValueCountFrequency (%)
o6259
 
8.5%
e5765
 
7.8%
c4676
 
6.3%
a4491
 
6.1%
m4076
 
5.5%
.3699
 
5.0%
r3657
 
5.0%
n3612
 
4.9%
i3589
 
4.9%
@3373
 
4.6%
Other values (29)30446
41.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter64215
87.2%
Other Punctuation7072
 
9.6%
Decimal Number2273
 
3.1%
Dash Punctuation83
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o6259
 
9.7%
e5765
 
9.0%
c4676
 
7.3%
a4491
 
7.0%
m4076
 
6.3%
r3657
 
5.7%
n3612
 
5.6%
i3589
 
5.6%
l3340
 
5.2%
s3154
 
4.9%
Other values (16)21596
33.6%
Decimal Number
Value  Count  Frequency (%)
2      265    11.7%
1      260    11.4%
3      260    11.4%
6      243    10.7%
4      243    10.7%
8      239    10.5%
5      227    10.0%
9      215    9.5%
7      210    9.2%
0      111    4.9%
Other Punctuation
ValueCountFrequency (%)
.3699
52.3%
@3373
47.7%
Dash Punctuation
ValueCountFrequency (%)
-83
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin64215
87.2%
Common9428
 
12.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
o6259
 
9.7%
e5765
 
9.0%
c4676
 
7.3%
a4491
 
7.0%
m4076
 
6.3%
r3657
 
5.7%
n3612
 
5.6%
i3589
 
5.6%
l3340
 
5.2%
s3154
 
4.9%
Other values (16)21596
33.6%
Common
ValueCountFrequency (%)
.3699
39.2%
@3373
35.8%
2265
 
2.8%
1260
 
2.8%
3260
 
2.8%
6243
 
2.6%
4243
 
2.6%
8239
 
2.5%
5227
 
2.4%
9215
 
2.3%
Other values (3)404
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII73643
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o6259
 
8.5%
e5765
 
7.8%
c4676
 
6.3%
a4491
 
6.1%
m4076
 
5.5%
.3699
 
5.0%
r3657
 
5.0%
n3612
 
4.9%
i3589
 
4.9%
@3373
 
4.6%
Other values (29)30446
41.3%

Gender
Categorical

MISSING

Distinct2
Distinct (%)0.1%
Missing78
Missing (%)2.0%
Memory size31.0 KiB
Male
1970 
Female
1905 

Length

Max length6
Median length4
Mean length4.983225806
Min length4

Characters and Unicode

Total characters19310
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowFemale
2nd rowMale
3rd rowFemale
4th rowFemale
5th rowFemale

Common Values

ValueCountFrequency (%)
Male1970
49.8%
Female1905
48.2%
(Missing)78
 
2.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
male1970
50.8%
female1905
49.2%

Most occurring characters

ValueCountFrequency (%)
e5780
29.9%
a3875
20.1%
l3875
20.1%
M1970
 
10.2%
F1905
 
9.9%
m1905
 
9.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter15435
79.9%
Uppercase Letter3875
 
20.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e5780
37.4%
a3875
25.1%
l3875
25.1%
m1905
 
12.3%
Uppercase Letter
ValueCountFrequency (%)
M1970
50.8%
F1905
49.2%

Most occurring scripts

ValueCountFrequency (%)
Latin19310
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e5780
29.9%
a3875
20.1%
l3875
20.1%
M1970
 
10.2%
F1905
 
9.9%
m1905
 
9.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII19310
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e5780
29.9%
a3875
20.1%
l3875
20.1%
M1970
 
10.2%
F1905
 
9.9%
m1905
 
9.9%

Dt_Applied
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct3953
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size31.0 KiB
07/03/91
 
1
14/02/89
 
1
19/11/84
 
1
14/10/83
 
1
15/05/83
 
1
Other values (3948)
3948 

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters31624
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique3953 ?
Unique (%)100.0%

Sample

1st row01/01/81
2nd row02/01/81
3rd row03/01/81
4th row04/01/81
5th row05/01/81
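
Dt_Applied is profiled as Categorical, but its values look like dates. The sketch below parses a few values quoted in this report with the standard library. The day/month/two-digit-year order is an assumption. It is supported by values such as 14/02/89, where 14 cannot be a month.

```python
from datetime import datetime

# Values quoted in the Dt_Applied section of this report.
samples = ["01/01/81", "02/01/81", "14/02/89", "19/11/84"]

# Assumption: day/month/two-digit-year. With %y, Python maps 69-99 to
# 1969-1999 and 00-68 to 2000-2068, so "81" becomes 1981.
parsed = [datetime.strptime(s, "%d/%m/%y").date() for s in samples]

for raw, d in zip(samples, parsed):
    print(raw, "->", d.isoformat())
```

Once parsed, the column could be profiled as a date rather than as 3953 distinct categories.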

Common Values

ValueCountFrequency (%)
07/03/911
 
< 0.1%
14/02/891
 
< 0.1%
19/11/841
 
< 0.1%
14/10/831
 
< 0.1%
15/05/831
 
< 0.1%
06/05/871
 
< 0.1%
13/11/841
 
< 0.1%
05/11/891
 
< 0.1%
11/08/881
 
< 0.1%
11/10/911
 
< 0.1%
Other values (3943)3943
99.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
07/03/911
 
< 0.1%
14/02/891
 
< 0.1%
19/11/841
 
< 0.1%
14/10/831
 
< 0.1%
15/05/831
 
< 0.1%
06/05/871
 
< 0.1%
13/11/841
 
< 0.1%
05/11/891
 
< 0.1%
11/08/881
 
< 0.1%
11/10/911
 
< 0.1%
Other values (3943)3943
99.7%

Most occurring characters

Value  Count  Frequency (%)
/      7906   25.0%
0      5259   16.6%
8      4381   13.9%
1      4020   12.7%
2      2664   8.4%
9      1744   5.5%
3      1288   4.1%
5      1096   3.5%
7      1096   3.5%
4      1085   3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number23718
75.0%
Other Punctuation7906
 
25.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      5259   22.2%
8      4381   18.5%
1      4020   16.9%
2      2664   11.2%
9      1744   7.4%
3      1288   5.4%
5      1096   4.6%
7      1096   4.6%
4      1085   4.6%
6      1085   4.6%
Other Punctuation
ValueCountFrequency (%)
/7906
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common31624
100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
/      7906   25.0%
0      5259   16.6%
8      4381   13.9%
1      4020   12.7%
2      2664   8.4%
9      1744   5.5%
3      1288   4.1%
5      1096   3.5%
7      1096   3.5%
4      1085   3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII31624
100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
/      7906   25.0%
0      5259   16.6%
8      4381   13.9%
1      4020   12.7%
2      2664   8.4%
9      1744   5.5%
3      1288   4.1%
5      1096   3.5%
7      1096   3.5%
4      1085   3.4%

University
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct3140
Distinct (%)81.9%
Missing118
Missing (%)3.0%
Memory size31.0 KiB
Tampere Polytechnic
 
4
Fukuoka Institute of Technology
 
4
Jiangxi University of Traditional Chinese Medicine
 
4
Arab Open University
 
4
Abant Izzet Baysal University
 
4
Other values (3135)
3815 

Length

Max length114
Median length29
Mean length30.49152542
Min length11

Characters and Unicode

Total characters116935
Distinct characters98
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks4 ?

Unique

Unique2542 ?
Unique (%)66.3%
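
The report separates Distinct (how many different values occur) from Unique (how many values occur exactly once). The sketch below shows the difference on a small hypothetical column. The list contents and the use of `None` for a missing value are illustrative assumptions, not data from the report.

```python
from collections import Counter

# Hypothetical miniature column; None stands for a missing value.
values = [
    "Carlow College",
    "Carlow College",
    "Arab Open University",
    "Tampere Polytechnic",
    None,
]

present = [v for v in values if v is not None]
counts = Counter(present)

n_distinct = len(counts)                              # different values seen
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

distinct_pct = 100 * n_distinct / len(present)
unique_pct = 100 * n_unique / len(present)
print(n_distinct, n_unique, f"{distinct_pct:.1f}%", f"{unique_pct:.1f}%")
```

The University figures are consistent with percentages taken over non-missing rows: 3140 / (3953 - 118) is about 81.9% and 2542 / (3953 - 118) is about 66.3%.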

Sample

1st rowWarner Southern College
2nd rowShri Lal Bahadur Shastri Rashtriya Sanskrit Vidyapeetha
3rd rowTechnische Universität Bergakademie Freiberg
4th rowDivine Word College of Legazpi
5th rowEast China Jiao Tong University

Common Values

ValueCountFrequency (%)
Tampere Polytechnic4
 
0.1%
Fukuoka Institute of Technology4
 
0.1%
Jiangxi University of Traditional Chinese Medicine4
 
0.1%
Arab Open University4
 
0.1%
Abant Izzet Baysal University4
 
0.1%
Phillips Graduate Institute4
 
0.1%
Stavropol State Technical University4
 
0.1%
Universidad de Congreso4
 
0.1%
Universidad Valle del Momboy4
 
0.1%
Carlow College4
 
0.1%
Other values (3130)3795
96.0%
(Missing)118
 
3.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
university2142
 
14.3%
of1144
 
7.6%
college544
 
3.6%
de397
 
2.7%
universidad307
 
2.0%
state274
 
1.8%
institute220
 
1.5%
and197
 
1.3%
technology195
 
1.3%
113
 
0.8%
Other values (3562)9447
63.1%

Most occurring characters

ValueCountFrequency (%)
11198
 
9.6%
i10758
 
9.2%
e10464
 
8.9%
n8336
 
7.1%
a7981
 
6.8%
t6906
 
5.9%
r6300
 
5.4%
o6107
 
5.2%
s5789
 
5.0%
l4376
 
3.7%
Other values (88)38720
33.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter91706
78.4%
Uppercase Letter13156
 
11.3%
Space Separator11198
 
9.6%
Other Punctuation508
 
0.4%
Dash Punctuation186
 
0.2%
Open Punctuation79
 
0.1%
Close Punctuation79
 
0.1%
Decimal Number19
 
< 0.1%
Control2
 
< 0.1%
Initial Punctuation1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i10758
11.7%
e10464
11.4%
n8336
 
9.1%
a7981
 
8.7%
t6906
 
7.5%
r6300
 
6.9%
o6107
 
6.7%
s5789
 
6.3%
l4376
 
4.8%
y3155
 
3.4%
Other values (37)21534
23.5%
Uppercase Letter
ValueCountFrequency (%)
U2851
21.7%
S1352
10.3%
C1236
 
9.4%
A857
 
6.5%
M785
 
6.0%
T762
 
5.8%
I708
 
5.4%
N522
 
4.0%
P492
 
3.7%
B422
 
3.2%
Other values (19)3169
24.1%
Decimal Number
Value  Count  Frequency (%)
1      7      36.8%
7      3      15.8%
9      2      10.5%
4      2      10.5%
5      2      10.5%
3      1      5.3%
2      1      5.3%
8      1      5.3%
Other Punctuation
ValueCountFrequency (%)
,200
39.4%
.106
20.9%
'91
17.9%
"62
 
12.2%
&35
 
6.9%
/14
 
2.8%
Control
ValueCountFrequency (%)
“1
50.0%
”1
50.0%
Space Separator
ValueCountFrequency (%)
11198
100.0%
Dash Punctuation
ValueCountFrequency (%)
-186
100.0%
Open Punctuation
ValueCountFrequency (%)
(79
100.0%
Close Punctuation
ValueCountFrequency (%)
)79
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin104862
89.7%
Common12073
 
10.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i10758
 
10.3%
e10464
 
10.0%
n8336
 
7.9%
a7981
 
7.6%
t6906
 
6.6%
r6300
 
6.0%
o6107
 
5.8%
s5789
 
5.5%
l4376
 
4.2%
y3155
 
3.0%
Other values (66)34690
33.1%
Common
ValueCountFrequency (%)
11198
92.8%
,200
 
1.7%
-186
 
1.5%
.106
 
0.9%
'91
 
0.8%
(79
 
0.7%
)79
 
0.7%
"62
 
0.5%
&35
 
0.3%
/14
 
0.1%
Other values (12)23
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII116357
99.5%
Latin 1 Sup569
 
0.5%
Latin Ext A7
 
< 0.1%
Punctuation2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11198
 
9.6%
i10758
 
9.2%
e10464
 
9.0%
n8336
 
7.2%
a7981
 
6.9%
t6906
 
5.9%
r6300
 
5.4%
o6107
 
5.2%
s5789
 
5.0%
l4376
 
3.8%
Other values (60)38142
32.8%
Latin 1 Sup
ValueCountFrequency (%)
é211
37.1%
ó90
15.8%
ä65
 
11.4%
ü59
 
10.4%
á43
 
7.6%
í33
 
5.8%
è11
 
1.9%
ñ9
 
1.6%
ç9
 
1.6%
ú8
 
1.4%
Other values (12)31
 
5.4%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Latin Ext A
ValueCountFrequency (%)
č4
57.1%
Š1
 
14.3%
ı1
 
14.3%
ž1
 
14.3%

Loan Amnt
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 434
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13017.49937
Minimum: 1000
Maximum: 35000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 3000
Q1: 6500
Median: 12000
Q3: 17625
95-th percentile: 30000
Maximum: 35000
Range: 34000
Interquartile range (IQR): 11125

Descriptive statistics

Standard deviation: 8155.330342
Coefficient of variation (CV): 0.6264897821
Kurtosis: 0.3258532123
Mean: 13017.49937
Median Absolute Deviation (MAD): 5500
Skewness: 0.9233128761
Sum: 51458175
Variance: 66509412.98
Monotonicity: Not monotonic
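
Several of the figures above follow from one another. The sketch below recomputes the derived quantities for Loan Amnt from the reported mean, standard deviation, quartiles and extremes. It is only a consistency check on the published numbers, with the row count taken from the Overview.

```python
import math

# Values copied from the Loan Amnt summary in this report.
n_obs = 3953
mean = 13017.49937
std = 8155.330342
q1, q3 = 6500, 17625
minimum, maximum = 1000, 35000

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
total = mean * n_obs              # sum recovered from mean and row count
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

print(f"CV={cv:.4f} variance={variance:.2f} sum={total:.0f} "
      f"IQR={iqr} range={value_range}")
print(math.isclose(total, 51458175, rel_tol=1e-6))
```

The same relations can be used to check the other numeric variables in this report.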
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
12000               315    8.0%
10000               259    6.6%
15000               190    4.8%
20000               174    4.4%
6000                165    4.2%
5000                153    3.9%
35000               143    3.6%
8000                124    3.1%
16000               99     2.5%
25000               97     2.5%
Other values (424)  2234   56.5%

Smallest values
Value  Count  Frequency (%)
1000   21     0.5%
1100   1      < 0.1%
1200   9      0.2%
1300   2      0.1%
1325   1      < 0.1%
1400   3      0.1%
1450   2      0.1%
1500   11     0.3%
1600   6      0.2%
1700   1      < 0.1%

Largest values
Value  Count  Frequency (%)
35000  143    3.6%
34475  1      < 0.1%
34000  2      0.1%
33950  1      < 0.1%
33600  2      0.1%
33425  1      < 0.1%
33000  1      < 0.1%
32875  1      < 0.1%
32275  1      < 0.1%
32000  3      0.1%

Funded amnt inv
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 828
Distinct (%): 20.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12809.79216
Minimum: 750
Maximum: 35000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 750
5-th percentile: 3000
Q1: 6500
Median: 11775
Q3: 17000
95-th percentile: 29735
Maximum: 35000
Range: 34250
Interquartile range (IQR): 10500

Descriptive statistics

Standard deviation: 7935.907682
Coefficient of variation (CV): 0.619518848
Kurtosis: 0.3951370723
Mean: 12809.79216
Median Absolute Deviation (MAD): 5275
Skewness: 0.9263171893
Sum: 50637108.41
Variance: 62978630.74
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
12000               249    6.3%
10000               222    5.6%
6000                153    3.9%
5000                143    3.6%
15000               139    3.5%
8000                113    2.9%
7000                87     2.2%
3000                74     1.9%
20000               72     1.8%
14000               64     1.6%
Other values (818)  2637   66.7%

Smallest values
Value  Count  Frequency (%)
750    1      < 0.1%
1000   20     0.5%
1100   1      < 0.1%
1200   9      0.2%
1300   2      0.1%
1325   1      < 0.1%
1400   3      0.1%
1450   2      0.1%
1500   11     0.3%
1600   6      0.2%

Largest values
Value        Count  Frequency (%)
35000        37     0.9%
34997.35245  1      < 0.1%
34993.65539  1      < 0.1%
34987.98452  1      < 0.1%
34987.27101  1      < 0.1%
34977.34674  1      < 0.1%
34975.81636  1      < 0.1%
34975        14     0.4%
34972.8295   1      < 0.1%
34972.50393  1      < 0.1%

TERM
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.0 KiB
36 months
2687 
60 months
1266 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters39530
Distinct characters10
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row 36 months
2nd row 60 months
3rd row 36 months
4th row 36 months
5th row 60 months
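
Every TERM value has length 10, and the character counts for this variable list 7906 spaces, two per row. That suggests each value carries a leading space, as in " 36 months". The sketch below shows one possible way to turn such strings into integer month counts. The helper name `term_to_months` is hypothetical.

```python
# Sample values as they appear in the TERM column, including the leading space.
raw_terms = [" 36 months", " 60 months", " 36 months"]


def term_to_months(raw: str) -> int:
    """Strip surrounding whitespace and return the number of months as an int."""
    number, unit = raw.strip().split()
    if unit != "months":
        raise ValueError(f"unexpected unit in {raw!r}")
    return int(number)


months = [term_to_months(t) for t in raw_terms]
print(months)
```

Converting the column this way would let TERM be treated as a two-valued numeric or ordinal field instead of free text.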

Common Values

ValueCountFrequency (%)
36 months2687
68.0%
60 months1266
32.0%

Length

Histogram of lengths of the category

Pie chart

Value   Count  Frequency (%)
months  3953   50.0%
36      2687   34.0%
60      1266   16.0%

Most occurring characters

Value    Count  Frequency (%)
(space)  7906   20.0%
6        3953   10.0%
m        3953   10.0%
o        3953   10.0%
n        3953   10.0%
t        3953   10.0%
h        3953   10.0%
s        3953   10.0%
3        2687   6.8%
0        1266   3.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter23718
60.0%
Space Separator7906
 
20.0%
Decimal Number7906
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
m3953
16.7%
o3953
16.7%
n3953
16.7%
t3953
16.7%
h3953
16.7%
s3953
16.7%
Decimal Number
Value  Count  Frequency (%)
6      3953   50.0%
3      2687   34.0%
0      1266   16.0%
Space Separator
ValueCountFrequency (%)
7906
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin23718
60.0%
Common15812
40.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
m3953
16.7%
o3953
16.7%
n3953
16.7%
t3953
16.7%
h3953
16.7%
s3953
16.7%
Common
ValueCountFrequency (%)
7906
50.0%
63953
25.0%
32687
 
17.0%
01266
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII39530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7906
20.0%
63953
10.0%
m3953
10.0%
o3953
10.0%
n3953
10.0%
t3953
10.0%
h3953
10.0%
s3953
10.0%
32687
 
6.8%
01266
 
3.2%

Int Rate
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 35
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1296908677
Minimum: 0.06
Maximum: 0.241
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 0.06
5-th percentile: 0.066
Q1: 0.099
Median: 0.127
Q3: 0.16
95-th percentile: 0.203
Maximum: 0.241
Range: 0.181
Interquartile range (IQR): 0.061

Descriptive statistics

Standard deviation: 0.04160931484
Coefficient of variation (CV): 0.3208345782
Kurtosis: -0.6951924625
Mean: 0.1296908677
Median Absolute Deviation (MAD): 0.033
Skewness: 0.226416223
Sum: 512.668
Variance: 0.001731335081
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=35)
Common values
Value              Count  Frequency (%)
0.117              324    8.2%
0.079              259    6.6%
0.127              259    6.6%
0.124              254    6.4%
0.135              231    5.8%
0.143              226    5.7%
0.107              213    5.4%
0.099              211    5.3%
0.089              198    5.0%
0.06               160    4.0%
Other values (25)  1618   40.9%

Smallest values
Value  Count  Frequency (%)
0.06   160    4.0%
0.066  156    3.9%
0.075  137    3.5%
0.079  259    6.6%
0.089  198    5.0%
0.099  211    5.3%
0.107  213    5.4%
0.117  324    8.2%
0.124  254    6.4%
0.127  259    6.6%

Largest values
Value  Count  Frequency (%)
0.241  2      0.1%
0.239  6      0.2%
0.235  6      0.2%
0.231  4      0.1%
0.227  6      0.2%
0.224  15     0.4%
0.221  19     0.5%
0.217  24     0.6%
0.213  28     0.7%
0.209  39     1.0%

INSTALLMENT
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1923
Distinct (%): 48.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 375.2073362
Minimum: 32.23
Maximum: 1283.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 32.23
5-th percentile: 93.88
Q1: 205.86
Median: 336
Q3: 494.59
95-th percentile: 813.626
Maximum: 1283.5
Range: 1251.27
Interquartile range (IQR): 288.73

Descriptive statistics

Standard deviation: 220.261152
Coefficient of variation (CV): 0.5870385006
Kurtosis: 0.8900854243
Mean: 375.2073362
Median Absolute Deviation (MAD): 140.06
Skewness: 0.9837168213
Sum: 1483194.6
Variance: 48514.9751
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
330.76               27     0.7%
396.92               25     0.6%
325.74               22     0.6%
386.7                21     0.5%
339.31               20     0.5%
334.16               19     0.5%
322.25               19     0.5%
343.09               18     0.5%
190.52               18     0.5%
368.45               17     0.4%
Other values (1913)  3747   94.8%

Smallest values
Value  Count  Frequency (%)
32.23  1      < 0.1%
32.58  2      0.1%
33.08  2      0.1%
33.55  1      < 0.1%
33.94  3      0.1%
34.31  1      < 0.1%
34.5   3      0.1%
34.8   2      0.1%
35.14  1      < 0.1%
35.31  4      0.1%

Largest values
Value    Count  Frequency (%)
1283.5   1      < 0.1%
1276.6   1      < 0.1%
1269.73  1      < 0.1%
1243.85  1      < 0.1%
1222.03  1      < 0.1%
1203.66  1      < 0.1%
1200.82  2      0.1%
1157.66  1      < 0.1%
1142.94  1      < 0.1%
1140.07  5      0.1%

GRADE
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct7
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size31.0 KiB
B
1262 
A
908 
C
811 
D
510 
E
313 
Other values (2)
149 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters3953
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowB
2nd rowC
3rd rowC
4th rowC
5th rowB

Common Values

ValueCountFrequency (%)
B1262
31.9%
A908
23.0%
C811
20.5%
D510
12.9%
E313
 
7.9%
F125
 
3.2%
G24
 
0.6%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
b1262
31.9%
a908
23.0%
c811
20.5%
d510
12.9%
e313
 
7.9%
f125
 
3.2%
g24
 
0.6%

Most occurring characters

ValueCountFrequency (%)
B1262
31.9%
A908
23.0%
C811
20.5%
D510
12.9%
E313
 
7.9%
F125
 
3.2%
G24
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter3953
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B1262
31.9%
A908
23.0%
C811
20.5%
D510
12.9%
E313
 
7.9%
F125
 
3.2%
G24
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin3953
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B1262
31.9%
A908
23.0%
C811
20.5%
D510
12.9%
E313
 
7.9%
F125
 
3.2%
G24
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII3953
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B1262
31.9%
A908
23.0%
C811
20.5%
D510
12.9%
E313
 
7.9%
F125
 
3.2%
G24
 
0.6%

Sub Grade
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct35
Distinct (%)0.9%
Missing0
Missing (%)0.0%
Memory size31.0 KiB
B3
324 
B5
 
260
A4
 
259
B4
 
254
C1
 
231
Other values (30)
2625 

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters7906
Distinct characters12
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowB2
2nd rowC4
3rd rowC5
4th rowC1
5th rowB5

Common Values

Value              Count  Frequency (%)
B3                 324    8.2%
B5                 260    6.6%
A4                 259    6.6%
B4                 254    6.4%
C1                 231    5.8%
C2                 227    5.7%
B2                 213    5.4%
B1                 211    5.3%
A5                 198    5.0%
A1                 158    4.0%
Other values (25)  1618   40.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
b3324
 
8.2%
b5260
 
6.6%
a4259
 
6.6%
b4254
 
6.4%
c1231
 
5.8%
c2227
 
5.7%
b2213
 
5.4%
b1211
 
5.3%
a5198
 
5.0%
a1158
 
4.0%
Other values (25)1618
40.9%

Most occurring characters

Value             Count  Frequency (%)
B                 1262   16.0%
A                 908    11.5%
2                 832    10.5%
C                 811    10.3%
4                 806    10.2%
3                 803    10.2%
1                 797    10.1%
5                 715    9.0%
D                 510    6.5%
E                 313    4.0%
Other values (2)  149    1.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter3953
50.0%
Decimal Number3953
50.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B1262
31.9%
A908
23.0%
C811
20.5%
D510
12.9%
E313
 
7.9%
F125
 
3.2%
G24
 
0.6%
Decimal Number
Value  Count  Frequency (%)
2      832    21.0%
4      806    20.4%
3      803    20.3%
1      797    20.2%
5      715    18.1%

Most occurring scripts

ValueCountFrequency (%)
Latin3953
50.0%
Common3953
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B1262
31.9%
A908
23.0%
C811
20.5%
D510
12.9%
E313
 
7.9%
F125
 
3.2%
G24
 
0.6%
Common
ValueCountFrequency (%)
2832
21.0%
4806
20.4%
3803
20.3%
1797
20.2%
5715
18.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII7906
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B1262
16.0%
A908
11.5%
2832
10.5%
C811
10.3%
4806
10.2%
3803
10.2%
1797
10.1%
5715
9.0%
D510
6.5%
E313
 
4.0%
Other values (2)149
 
1.9%

Home Ownership
Categorical

Distinct3
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size31.0 KiB
RENT
2081 
MORTGAGE
1577 
OWN
295 

Length

Max length8
Median length4
Mean length5.521123198
Min length3

Characters and Unicode

Total characters21825
Distinct characters9
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowRENT
2nd rowRENT
3rd rowRENT
4th rowRENT
5th rowRENT

Common Values

ValueCountFrequency (%)
RENT2081
52.6%
MORTGAGE1577
39.9%
OWN295
 
7.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
rent2081
52.6%
mortgage1577
39.9%
own295
 
7.5%

Most occurring characters

ValueCountFrequency (%)
R3658
16.8%
E3658
16.8%
T3658
16.8%
G3154
14.5%
N2376
10.9%
O1872
8.6%
M1577
7.2%
A1577
7.2%
W295
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter21825
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
R3658
16.8%
E3658
16.8%
T3658
16.8%
G3154
14.5%
N2376
10.9%
O1872
8.6%
M1577
7.2%
A1577
7.2%
W295
 
1.4%

Most occurring scripts

ValueCountFrequency (%)
Latin21825
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
R3658
16.8%
E3658
16.8%
T3658
16.8%
G3154
14.5%
N2376
10.9%
O1872
8.6%
M1577
7.2%
A1577
7.2%
W295
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII21825
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
R3658
16.8%
E3658
16.8%
T3658
16.8%
G3154
14.5%
N2376
10.9%
O1872
8.6%
M1577
7.2%
A1577
7.2%
W295
 
1.4%

Annual Inc
Real number (ℝ≥0)

Distinct: 813
Distinct (%): 20.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66175.97354
Minimum: 8280
Maximum: 550000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 8280
5-th percentile: 25000
Q1: 40100
median: 57000
Q3: 80000
95-th percentile: 135880
Maximum: 550000
Range: 541720
Interquartile range (IQR): 39900

Descriptive statistics

Standard deviation: 40498.80417
Coefficient of variation (CV): 0.6119865264
Kurtosis: 18.71426089
Mean: 66175.97354
Median Absolute Deviation (MAD): 18000
Skewness: 3.058200935
Sum: 261593623.4
Variance: 1640153139
Monotonicity: Not monotonic
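
Several of the Annual Inc statistics are derived from others in the same block. The sketch below recomputes them from the reported figures; it assumes the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of observations), and the variable names are illustrative.

```python
# Values copied from the Annual Inc summary in this report.
mean = 66175.97354
std_dev = 40498.80417
minimum, maximum = 8280, 550000
q1, q3 = 40100, 80000
n_rows = 3953  # number of observations, from the dataset overview

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std_dev / mean              # Coefficient of variation (CV)
variance = std_dev ** 2          # Variance
total = mean * n_rows            # Sum

print(value_range, iqr, cv, variance, total)
```

Small differences in the last digits are expected, because the report prints the mean and standard deviation rounded to ten significant figures.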
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
60000               154    3.9%
50000               149    3.8%
75000               120    3.0%
40000               120    3.0%
45000               114    2.9%
70000               96     2.4%
30000               93     2.4%
80000               93     2.4%
65000               88     2.2%
35000               82     2.1%
Other values (803)  2844   71.9%

Value  Count  Frequency (%)
8280   1      < 0.1%
8400   1      < 0.1%
9600   1      < 0.1%
9960   1      < 0.1%
10000  1      < 0.1%
11000  1      < 0.1%
11340  1      < 0.1%
11820  1      < 0.1%
12000  8      0.2%
12252  1      < 0.1%

Value   Count  Frequency (%)
550000  1      < 0.1%
525000  1      < 0.1%
408000  1      < 0.1%
400000  2      0.1%
365000  1      < 0.1%
350000  1      < 0.1%
325000  1      < 0.1%
300000  5      0.1%
290000  1      < 0.1%
281000  1      < 0.1%

Verification Status
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.0 KiB
Verified         1515
Not Verified     1247
Source Verified  1191

Length

Max length: 15
Median length: 12
Mean length: 11.37085758
Min length: 8

Characters and Unicode

Total characters: 44949
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Verified
2nd row: Source Verified
3rd row: Not Verified
4th row: Source Verified
5th row: Source Verified

Common Values

Value            Count  Frequency (%)
Verified         1515   38.3%
Not Verified     1247   31.5%
Source Verified  1191   30.1%

Length

Histogram of lengths of the category

Pie chart

Value     Count  Frequency (%)
verified  3953   61.9%
not       1247   19.5%
source    1191   18.6%

Most occurring characters

ValueCountFrequency (%)
e9097
20.2%
i7906
17.6%
r5144
11.4%
V3953
8.8%
f3953
8.8%
d3953
8.8%
o2438
 
5.4%
2438
 
5.4%
N1247
 
2.8%
t1247
 
2.8%
Other values (3)3573
 
7.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter36120
80.4%
Uppercase Letter6391
 
14.2%
Space Separator2438
 
5.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e9097
25.2%
i7906
21.9%
r5144
14.2%
f3953
10.9%
d3953
10.9%
o2438
 
6.7%
t1247
 
3.5%
u1191
 
3.3%
c1191
 
3.3%
Uppercase Letter
ValueCountFrequency (%)
V3953
61.9%
N1247
 
19.5%
S1191
 
18.6%
Space Separator
ValueCountFrequency (%)
2438
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin42511
94.6%
Common2438
 
5.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e9097
21.4%
i7906
18.6%
r5144
12.1%
V3953
9.3%
f3953
9.3%
d3953
9.3%
o2438
 
5.7%
N1247
 
2.9%
t1247
 
2.9%
S1191
 
2.8%
Other values (2)2382
 
5.6%
Common
ValueCountFrequency (%)
2438
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII44949
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e9097
20.2%
i7906
17.6%
r5144
11.4%
V3953
8.8%
f3953
8.8%
d3953
8.8%
o2438
 
5.4%
2438
 
5.4%
N1247
 
2.8%
t1247
 
2.8%
Other values (3)3573
 
7.9%

Loan Writeoff
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.0 KiB
0  3275
1  678

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3953
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      3275   82.8%
1      678    17.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      3275   82.8%
1      678    17.2%

Most occurring characters

ValueCountFrequency (%)
03275
82.8%
1678
 
17.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number3953
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
03275
82.8%
1678
 
17.2%

Most occurring scripts

ValueCountFrequency (%)
Common3953
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
03275
82.8%
1678
 
17.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII3953
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
03275
82.8%
1678
 
17.2%

PURPOSE
Categorical

Distinct: 13
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 31.0 KiB
debt_consolidation  2102
credit_card         792
other               297
home_improvement    196
small_business      145
Other values (8)    421

Length

Max length: 18
Median length: 18
Mean length: 14.28307614
Min length: 3

Characters and Unicode

Total characters: 56461
Distinct characters: 22
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: credit_card
2nd row: car
3rd row: small_business
4th row: other
5th row: other

Common Values

Value               Count  Frequency (%)
debt_consolidation  2102   53.2%
credit_card         792    20.0%
other               297    7.5%
home_improvement    196    5.0%
small_business      145    3.7%
major_purchase      100    2.5%
car                 90     2.3%
wedding             63     1.6%
medical             52     1.3%
moving              39     1.0%
Other values (3)    77     1.9%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
debt_consolidation  2102   53.2%
credit_card         792    20.0%
other               297    7.5%
home_improvement    196    5.0%
small_business      145    3.7%
major_purchase      100    2.5%
car                 90     2.3%
wedding             63     1.6%
medical             52     1.3%
moving              39     1.0%
Other values (3)    77     1.9%

Most occurring characters

ValueCountFrequency (%)
o7205
12.8%
d5966
10.6%
i5525
9.8%
t5523
9.8%
n4693
8.3%
e4206
7.4%
c3962
7.0%
a3455
 
6.1%
_3341
 
5.9%
s2819
 
5.0%
Other values (12)9766
17.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter53120
94.1%
Connector Punctuation3341
 
5.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o7205
13.6%
d5966
11.2%
i5525
10.4%
t5523
10.4%
n4693
8.8%
e4206
7.9%
c3962
7.5%
a3455
6.5%
s2819
 
5.3%
l2450
 
4.6%
Other values (11)7316
13.8%
Connector Punctuation
ValueCountFrequency (%)
_3341
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin53120
94.1%
Common3341
 
5.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o7205
13.6%
d5966
11.2%
i5525
10.4%
t5523
10.4%
n4693
8.8%
e4206
7.9%
c3962
7.5%
a3455
6.5%
s2819
 
5.3%
l2450
 
4.6%
Other values (11)7316
13.8%
Common
ValueCountFrequency (%)
_3341
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII56461
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o7205
12.8%
d5966
10.6%
i5525
9.8%
t5523
9.8%
n4693
8.3%
e4206
7.4%
c3962
7.0%
a3455
 
6.1%
_3341
 
5.9%
s2819
 
5.0%
Other values (12)9766
17.3%

Zip Code
Categorical

HIGH CARDINALITY

Distinct: 615
Distinct (%): 15.6%
Missing: 0
Missing (%): 0.0%
Memory size: 31.0 KiB
900xx               55
606xx               55
100xx               54
112xx               50
945xx               49
Other values (610)  3690

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 19765
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 148
Unique (%): 3.7%

Sample

1st row: 860xx
2nd row: 309xx
3rd row: 606xx
4th row: 917xx
5th row: 972xx

Common Values

Value               Count  Frequency (%)
900xx               55     1.4%
606xx               55     1.4%
100xx               54     1.4%
112xx               50     1.3%
945xx               49     1.2%
070xx               45     1.1%
331xx               44     1.1%
750xx               41     1.0%
300xx               41     1.0%
113xx               40     1.0%
Other values (605)  3479   88.0%

Length

Histogram of lengths of the category
Value               Count  Frequency (%)
900xx               55     1.4%
606xx               55     1.4%
100xx               54     1.4%
112xx               50     1.3%
945xx               49     1.2%
070xx               45     1.1%
331xx               44     1.1%
750xx               41     1.0%
300xx               41     1.0%
113xx               40     1.0%
Other values (605)  3479   88.0%

Most occurring characters

ValueCountFrequency (%)
x7906
40.0%
01903
 
9.6%
11535
 
7.8%
91309
 
6.6%
21309
 
6.6%
31269
 
6.4%
71023
 
5.2%
5924
 
4.7%
4914
 
4.6%
8855
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number11859
60.0%
Lowercase Letter7906
40.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
01903
16.0%
11535
12.9%
91309
11.0%
21309
11.0%
31269
10.7%
71023
8.6%
5924
7.8%
4914
7.7%
8855
7.2%
6818
6.9%
Lowercase Letter
ValueCountFrequency (%)
x7906
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common11859
60.0%
Latin7906
40.0%

Most frequent character per script

Common
ValueCountFrequency (%)
01903
16.0%
11535
12.9%
91309
11.0%
21309
11.0%
31269
10.7%
71023
8.6%
5924
7.8%
4914
7.7%
8855
7.2%
6818
6.9%
Latin
ValueCountFrequency (%)
x7906
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII19765
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
x7906
40.0%
01903
 
9.6%
11535
 
7.8%
91309
 
6.6%
21309
 
6.6%
31269
 
6.4%
71023
 
5.2%
5924
 
4.7%
4914
 
4.6%
8855
 
4.3%

Add State
Categorical

Distinct: 43
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.0 KiB
CA                 729
NY                 372
FL                 304
TX                 273
NJ                 181
Other values (38)  2094

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 7906
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AZ
2nd row: GA
3rd row: IL
4th row: CA
5th row: OR

Common Values

Value              Count  Frequency (%)
CA                 729    18.4%
NY                 372    9.4%
FL                 304    7.7%
TX                 273    6.9%
NJ                 181    4.6%
IL                 155    3.9%
GA                 146    3.7%
PA                 136    3.4%
VA                 130    3.3%
OH                 124    3.1%
Other values (33)  1403   35.5%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
ca                 729    18.4%
ny                 372    9.4%
fl                 304    7.7%
tx                 273    6.9%
nj                 181    4.6%
il                 155    3.9%
ga                 146    3.7%
pa                 136    3.4%
va                 130    3.3%
oh                 124    3.1%
Other values (33)  1403   35.5%

Most occurring characters

ValueCountFrequency (%)
A1557
19.7%
C1032
13.1%
N803
10.2%
L537
 
6.8%
M413
 
5.2%
Y407
 
5.1%
T394
 
5.0%
O353
 
4.5%
I309
 
3.9%
F304
 
3.8%
Other values (14)1797
22.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter7906
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A1557
19.7%
C1032
13.1%
N803
10.2%
L537
 
6.8%
M413
 
5.2%
Y407
 
5.1%
T394
 
5.0%
O353
 
4.5%
I309
 
3.9%
F304
 
3.8%
Other values (14)1797
22.7%

Most occurring scripts

ValueCountFrequency (%)
Latin7906
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A1557
19.7%
C1032
13.1%
N803
10.2%
L537
 
6.8%
M413
 
5.2%
Y407
 
5.1%
T394
 
5.0%
O353
 
4.5%
I309
 
3.9%
F304
 
3.8%
Other values (14)1797
22.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII7906
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A1557
19.7%
C1032
13.1%
N803
10.2%
L537
 
6.8%
M413
 
5.2%
Y407
 
5.1%
T394
 
5.0%
O353
 
4.5%
I309
 
3.9%
F304
 
3.8%
Other values (14)1797
22.7%

DTI
Real number (ℝ≥0)

Distinct: 1961
Distinct (%): 49.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.42828738
Minimum: 0
Maximum: 29.85
Zeros: 3
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3.932
Q1: 9.58
median: 14.45
Q3: 19.47
95-th percentile: 24.214
Maximum: 29.85
Range: 29.85
Interquartile range (IQR): 9.89

Descriptive statistics

Standard deviation: 6.378445753
Coefficient of variation (CV): 0.4420792008
Kurtosis: -0.7703420751
Mean: 14.42828738
Median Absolute Deviation (MAD): 4.94
Skewness: -0.04903565752
Sum: 57035.02
Variance: 40.68457022
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
11.8                 9      0.2%
18.63                8      0.2%
20.88                8      0.2%
12.48                7      0.2%
9.65                 7      0.2%
17.67                7      0.2%
19.63                7      0.2%
16.4                 7      0.2%
16.2                 7      0.2%
18.84                7      0.2%
Other values (1951)  3879   98.1%

Value  Count  Frequency (%)
0      3      0.1%
0.02   2      0.1%
0.07   1      < 0.1%
0.2    1      < 0.1%
0.25   1      < 0.1%
0.32   2      0.1%
0.34   1      < 0.1%
0.41   1      < 0.1%
0.55   1      < 0.1%
0.57   1      < 0.1%

Value  Count  Frequency (%)
29.85  1      < 0.1%
29.83  1      < 0.1%
29.73  1      < 0.1%
29.72  1      < 0.1%
29.63  1      < 0.1%
29.44  2      0.1%
29.36  1      < 0.1%
29.35  1      < 0.1%
29.29  1      < 0.1%
29.26  1      < 0.1%

Delinq 2Yrs
Real number (ℝ≥0)

ZEROS

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1085251708
Minimum: 0
Maximum: 6
Zeros: 3628
Zeros (%): 91.8%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4087983222
Coefficient of variation (CV): 3.766852606
Kurtosis: 32.99870086
Mean: 0.1085251708
Median Absolute Deviation (MAD): 0
Skewness: 4.954297207
Sum: 429
Variance: 0.1671160683
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
0      3628   91.8%
1      246    6.2%
2      61     1.5%
3      13     0.3%
4      4      0.1%
6      1      < 0.1%

Value  Count  Frequency (%)
0      3628   91.8%
1      246    6.2%
2      61     1.5%
3      13     0.3%
4      4      0.1%
6      1      < 0.1%

Value  Count  Frequency (%)
6      1      < 0.1%
4      4      0.1%
3      13     0.3%
2      61     1.5%
1      246    6.2%
0      3628   91.8%

Inq Last 6Mths
Real number (ℝ≥0)

ZEROS

Distinct: 9
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.8555527448
Minimum: 0
Maximum: 8
Zeros: 1822
Zeros (%): 46.1%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.997025005
Coefficient of variation (CV): 1.165357731
Kurtosis: 2.163689287
Mean: 0.8555527448
Median Absolute Deviation (MAD): 1
Skewness: 1.26526022
Sum: 3382
Variance: 0.9940588606
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value  Count  Frequency (%)
0      1822   46.1%
1      1245   31.5%
2      584    14.8%
3      265    6.7%
4      21     0.5%
5      10     0.3%
6      3      0.1%
7      2      0.1%
8      1      < 0.1%

Value  Count  Frequency (%)
0      1822   46.1%
1      1245   31.5%
2      584    14.8%
3      265    6.7%
4      21     0.5%
5      10     0.3%
6      3      0.1%
7      2      0.1%
8      1      < 0.1%

Value  Count  Frequency (%)
8      1      < 0.1%
7      2      0.1%
6      3      0.1%
5      10     0.3%
4      21     0.5%
3      265    6.7%
2      584    14.8%
1      1245   31.5%
0      1822   46.1%

Pub Rec
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.0 KiB
0  3831
1  120
2  2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 3953
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      3831   96.9%
1      120    3.0%
2      2      0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      3831   96.9%
1      120    3.0%
2      2      0.1%

Most occurring characters

ValueCountFrequency (%)
03831
96.9%
1120
 
3.0%
22
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number3953
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
03831
96.9%
1120
 
3.0%
22
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Common3953
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
03831
96.9%
1120
 
3.0%
22
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII3953
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
03831
96.9%
1120
 
3.0%
22
 
0.1%

Revol Bal
Real number (ℝ≥0)

ZEROS

Distinct: 3672
Distinct (%): 92.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14367.44751
Minimum: 0
Maximum: 140967
Zeros: 42
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1240.4
Q1: 6352
median: 11449
Q3: 18151
95-th percentile: 35148.4
Maximum: 140967
Range: 140967
Interquartile range (IQR): 11799

Descriptive statistics

Standard deviation: 13468.63453
Coefficient of variation (CV): 0.937441012
Kurtosis: 18.01764983
Mean: 14367.44751
Median Absolute Deviation (MAD): 5657
Skewness: 3.322035836
Sum: 56794520
Variance: 181404116.1
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    42     1.1%
8032                 3      0.1%
13034                3      0.1%
14848                3      0.1%
10980                3      0.1%
6565                 3      0.1%
15183                3      0.1%
11338                3      0.1%
18467                3      0.1%
8357                 3      0.1%
Other values (3662)  3884   98.3%

Value  Count  Frequency (%)
0      42     1.1%
3      1      < 0.1%
6      1      < 0.1%
8      1      < 0.1%
16     1      < 0.1%
25     1      < 0.1%
33     1      < 0.1%
41     2      0.1%
50     1      < 0.1%
62     1      < 0.1%

Value   Count  Frequency (%)
140967  1      < 0.1%
131949  1      < 0.1%
130920  1      < 0.1%
124744  1      < 0.1%
123416  1      < 0.1%
120504  1      < 0.1%
112522  1      < 0.1%
110856  1      < 0.1%
108339  1      < 0.1%
106406  1      < 0.1%

Total Paymnt
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 3710
Distinct (%): 93.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14435.06432
Minimum: 0
Maximum: 58886.47343
Zeros: 2
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 31.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2401.064047
Q1: 6614.78722
median: 11907.35
Q3: 19190.68001
95-th percentile: 35788.92425
Maximum: 58886.47343
Range: 58886.47343
Interquartile range (IQR): 12575.89279

Descriptive statistics

Standard deviation: 10492.53033
Coefficient of variation (CV): 0.7268779753
Kurtosis: 1.593830926
Mean: 14435.06432
Median Absolute Deviation (MAD): 5937.176941
Skewness: 1.261678967
Sum: 57061809.25
Variance: 110093192.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
14288.76169          8      0.2%
13148.13786          7      0.2%
11907.34732          7      0.2%
12029.45             7      0.2%
11600.98             6      0.2%
10956.77596          5      0.1%
9011.557494          5      0.1%
14288.77             5      0.1%
11726.32             5      0.1%
13263.96             5      0.1%
Other values (3700)  3893   98.5%

Value   Count  Frequency (%)
0       2      0.1%
91.39   1      < 0.1%
151.8   1      < 0.1%
165.37  1      < 0.1%
203.55  1      < 0.1%
258.46  1      < 0.1%
262.7   1      < 0.1%
309.36  1      < 0.1%
328.01  1      < 0.1%
331.83  1      < 0.1%

Value        Count  Frequency (%)
58886.47343  1      < 0.1%
58133.3199   1      < 0.1%
58090.95207  1      < 0.1%
58071.19982  1      < 0.1%
58071.19977  1      < 0.1%
57997.27995  1      < 0.1%
57143.25996  1      < 0.1%
57117.89995  1      < 0.1%
56681.8859   1      < 0.1%
56681.88585  1      < 0.1%

Interactions

[100 pairwise interaction plots (Matplotlib SVG); images not preserved in this text export]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
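
The definition above translates directly into code. The following is a minimal pure-Python sketch, not the report generator's own implementation; the function name `pearson_r` is illustrative, and the example values are the Loan Amnt and INSTALLMENT entries of the first five sample rows listed at the end of this report.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors of the covariance and both standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Loan Amnt and INSTALLMENT from the first five sample rows of this report.
loan_amnt = [5000, 2500, 2400, 10000, 3000]
installment = [162.87, 59.83, 84.33, 339.31, 67.79]

r = pearson_r(loan_amnt, installment)
# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r([0.001 * x + 7 for x in loan_amnt], installment)
print(r, r_rescaled)
```

Five rows are far too few to reproduce the report's coefficients; the example only illustrates the calculation and the invariance property.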

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
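
As a sketch of this rank-based calculation in pure Python: the helper names `average_ranks` and `spearman_rho` are illustrative, and giving tied values the average of their ranks is an assumption (it is the common convention, but the report does not state how ties are handled).

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# A monotonic but strongly nonlinear relationship still yields rho = 1.
xs = [1, 2, 3, 4, 5]
ys = [math.exp(x) for x in xs]
print(spearman_rho(xs, ys))
```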

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
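
The pair-counting formula as stated corresponds to the tau-a variant. The sketch below implements exactly that in pure Python; the function name `kendall_tau_a` is illustrative. Note the assumption: statistical libraries commonly report the tie-corrected tau-b instead, so values in the report's heatmap may differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in either variable count as neither.
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

This direct version compares every pair and so runs in quadratic time, which is fine for illustration but slow for large columns.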

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
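
A minimal pure-Python sketch of a bias-corrected Cramér's V follows. It uses the correction as it is usually stated for Bergsma's estimator (subtracting the expected bias from φ² and shrinking the row and column counts); the function name `cramers_v_corrected` and the toy data are illustrative, and returning 0.0 for a single-category variable is a choice made for this sketch rather than anything the report specifies.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equal-length categorical samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over every cell, including empty ones.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: remove the expected value of phi2 under independence
    # and shrink the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a variable with a single category carries no association
    return math.sqrt(phi2_corr / denom)


# Toy data: identical labels give V close to 1, a balanced crossing gives 0.
same = ["a", "b"] * 50
print(cramers_v_corrected(same, same))
print(cramers_v_corrected(["a", "a", "b", "b"] * 25, ["c", "d", "c", "d"] * 25))
```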

Missing values

2021-06-30T18:00:13.739897image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
A simple visualization of nullity by column.
2021-06-30T18:00:14.519811image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
2021-06-30T18:00:14.788093image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
2021-06-30T18:00:14.928717image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
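
The nullity correlation mentioned above can be approximated as the Pearson correlation between two columns' missing-value indicators. The sketch below assumes that definition and uses hypothetical helper names (`is_missing`, `nullity_correlation`); it is an illustration, not the plotting library's own code.

```python
from math import isnan, sqrt


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and isnan(value))


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    Returns None when either column is fully present or fully missing,
    because the correlation is undefined for a constant indicator.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same length")
    a = [1.0 if is_missing(v) else 0.0 for v in col_a]
    b = [1.0 if is_missing(v) else 0.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns: values are missing in exactly the same rows, then in opposite rows.
email = ["a@example.com", None, "b@example.com", None]
university = ["Uni A", None, "Uni B", None]
print(nullity_correlation(email, university))                    # 1.0
print(nullity_correlation(email, [None, "Uni A", None, "Uni B"]))  # -1.0
```

A value near +1 means the two variables tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means their missingness is unrelated.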

Sample

First rows

| | Name | Email ID | Gender | Dt_Applied | University | Loan Amnt | Funded amnt inv | TERM | Int Rate | INSTALLMENT | GRADE | Sub Grade | Home Ownership | Annual Inc | Verification Status | Loan Writeoff | PURPOSE | Zip Code | Add State | DTI | Delinq 2Yrs | Inq Last 6Mths | Pub Rec | Revol Bal | Total Paymnt |
| 0 | Calley Giron | cgiron0@ehow.com | Female | 01/01/81 | Warner Southern College | 5000 | 4975.0 | 36 months | 0.107 | 162.87 | B | B2 | RENT | 24000.0 | Verified | 0 | credit_card | 860xx | AZ | 27.65 | 0 | 1 | 0 | 13648 | 5863.155187 |
| 1 | Linus Stud | lstud1@washington.edu | Male | 02/01/81 | Shri Lal Bahadur Shastri Rashtriya Sanskrit Vidyapeetha | 2500 | 2500.0 | 60 months | 0.153 | 59.83 | C | C4 | RENT | 30000.0 | Source Verified | 1 | car | 309xx | GA | 1.00 | 0 | 5 | 0 | 1687 | 1014.530000 |
| 2 | Lorelle Ambage | lambage2@wix.com | Female | 03/01/81 | Technische Universität Bergakademie Freiberg | 2400 | 2400.0 | 36 months | 0.160 | 84.33 | C | C5 | RENT | 12252.0 | Not Verified | 0 | small_business | 606xx | IL | 8.72 | 0 | 2 | 0 | 2956 | 3005.666844 |
| 3 | Anna-diane Larrat | alarrat3@economist.com | Female | 04/01/81 | Divine Word College of Legazpi | 10000 | 10000.0 | 36 months | 0.135 | 339.31 | C | C1 | RENT | 49200.0 | Source Verified | 0 | other | 917xx | CA | 20.00 | 0 | 1 | 0 | 5598 | 12231.890000 |
| 4 | Gill Ruske | NaN | Female | 05/01/81 | East China Jiao Tong University | 3000 | 3000.0 | 60 months | 0.127 | 67.79 | B | B5 | RENT | 80000.0 | Source Verified | 0 | other | 972xx | OR | 17.94 | 0 | 0 | 0 | 27783 | 4066.908161 |
| 5 | Evelyn MacFaul | emacfaul5@theatlantic.com | Female | 06/01/81 | Ahmedabad University | 5000 | 5000.0 | 36 months | 0.079 | 156.46 | A | A4 | RENT | 36000.0 | Source Verified | 0 | wedding | 852xx | AZ | 11.20 | 0 | 3 | 0 | 7963 | 5632.210000 |
| 6 | Ainslie Rainard | arainard6@virginia.edu | Female | 07/01/81 | NaN | 7000 | 7000.0 | 60 months | 0.160 | 170.08 | C | C5 | RENT | 47004.0 | Not Verified | 0 | debt_consolidation | 280xx | NC | 23.51 | 0 | 1 | 0 | 17726 | 10137.840010 |
| 7 | Emmott Hamby | ehamby7@prnewswire.com | Male | 08/01/81 | Institute of Business Management | 3000 | 3000.0 | 36 months | 0.186 | 109.43 | E | E1 | RENT | 48000.0 | Source Verified | 0 | car | 900xx | CA | 5.35 | 0 | 2 | 0 | 8221 | 3939.135294 |
| 8 | Shem Toomer | stoomer8@home.pl | Male | 09/01/81 | Osaka University of Education | 5600 | 5600.0 | 60 months | 0.213 | 152.39 | F | F2 | OWN | 40000.0 | Source Verified | 1 | small_business | 958xx | CA | 5.55 | 0 | 2 | 0 | 5210 | 647.500000 |
| 9 | Giana Aberhart | gaberhart9@mozilla.com | Female | 10/01/81 | American Public University | 5375 | 5350.0 | 60 months | 0.127 | 121.45 | B | B5 | RENT | 15000.0 | Verified | 1 | other | 774xx | TX | 18.08 | 0 | 0 | 0 | 9279 | 1484.590000 |

Last rows

| | Name | Email ID | Gender | Dt_Applied | University | Loan Amnt | Funded amnt inv | TERM | Int Rate | INSTALLMENT | GRADE | Sub Grade | Home Ownership | Annual Inc | Verification Status | Loan Writeoff | PURPOSE | Zip Code | Add State | DTI | Delinq 2Yrs | Inq Last 6Mths | Pub Rec | Revol Bal | Total Paymnt |
| 3943 | Merla Thebe | mthebeq7@cocolog-nifty.com | Female | 21/10/91 | North Eastern Hill University | 6000 | 6000.0 | 36 months | 0.163 | 211.81 | D | D1 | RENT | 39564.0 | Verified | 1 | debt_consolidation | 606xx | IL | 23.78 | 2 | 1 | 0 | 2028 | 3388.960000 |
| 3944 | Marcellina Dinneges | mdinnegesq8@infoseek.co.jp | Female | 22/10/91 | Universidade Católica de Santos | 2400 | 2400.0 | 36 months | 0.117 | 79.39 | B | B3 | RENT | 39800.0 | Not Verified | 0 | other | 303xx | GA | 14.32 | 0 | 0 | 0 | 15497 | 2836.660516 |
| 3945 | Way Symonds | wsymondsq9@mlb.com | Male | 23/10/91 | American International University West Africa | 25000 | 25000.0 | 60 months | 0.183 | 638.25 | D | D5 | MORTGAGE | 156000.0 | Source Verified | 0 | house | 944xx | CA | 5.85 | 0 | 0 | 0 | 10709 | 37936.750000 |
| 3946 | Ailene Matejka | NaN | Female | 24/10/91 | Kaya University | 20000 | 20000.0 | 36 months | 0.117 | 661.52 | B | B3 | RENT | 80700.0 | Verified | 0 | debt_consolidation | 946xx | CA | 13.67 | 0 | 1 | 0 | 7211 | 23406.523000 |
| 3947 | Samuel Overel | NaN | Male | 25/10/91 | Northwestern University | 12000 | 12000.0 | 60 months | 0.183 | 306.36 | D | D5 | MORTGAGE | 34000.0 | Not Verified | 1 | debt_consolidation | 177xx | PA | 12.56 | 0 | 0 | 0 | 6114 | 9667.950000 |
| 3948 | Corbie Creeboe | ccreeboeqc@sitemeter.com | Male | 26/10/91 | Shaheed Rajaei Teacher Training University | 12000 | 12000.0 | 36 months | 0.135 | 407.17 | C | C1 | RENT | 125000.0 | Source Verified | 0 | wedding | 086xx | NJ | 13.18 | 0 | 1 | 0 | 46286 | 14657.917650 |
| 3949 | Bobbe Ochterlonie | bochterlonieqd@ezinearticles.com | Female | 27/10/91 | Dhofar University | 15000 | 15000.0 | 36 months | 0.124 | 501.23 | B | B4 | RENT | 72000.0 | Verified | 0 | debt_consolidation | 104xx | NY | 7.47 | 0 | 1 | 0 | 12147 | 16729.253640 |
| 3950 | Corella Esposito | cespositoqe@macromedia.com | Female | 28/10/91 | University of Jan Evangelista Purkyne | 12000 | 12000.0 | 36 months | 0.060 | 365.23 | A | A1 | OWN | 48000.0 | Not Verified | 0 | debt_consolidation | 365xx | AL | 23.35 | 0 | 0 | 0 | 22385 | 13148.137860 |
| 3951 | Prince Dibdin | pdibdinqf@businessinsider.com | Male | 29/10/91 | College in Sládkovičovo | 15000 | 15000.0 | 60 months | 0.160 | 364.46 | C | C5 | RENT | 50000.0 | Verified | 1 | debt_consolidation | 907xx | CA | 18.26 | 0 | 1 | 0 | 9799 | 10883.540000 |
| 3952 | Georgette Warratt | gwarrattqg@java.com | Female | 30/10/91 | Technical University of Lublin | 15000 | 14975.0 | 60 months | 0.153 | 358.98 | C | C4 | MORTGAGE | 32976.0 | Not Verified | 1 | debt_consolidation | 177xx | PA | 17.90 | 0 | 1 | 0 | 7956 | 11704.260000 |